Probabilistic Data Integration Systems

Authors

  • Dan Suciu
  • Nilesh Dalvi
  • Brian Harris
  • Brent Louie
  • Chris Re
Abstract

Current data integration techniques are successful at managing well-defined and well-understood data integration tasks, but do not cope well with uncertainty. However, the amount of uncertain data is growing with the number and variety of data sources being integrated, both in traditional data integration tasks, such as enterprise data integration, and in next-generation integration problems, such as combining structured data with information extracted automatically from text. We argue that uncertainty in data integration is best handled with a probabilistic data model, and we propose probabilistic data integration systems (PDIS). Our justification is that probabilities can model a rich variety of imprecisions arising in information integration. For example:

Similar resources

Uncertainty in data integration systems: automatic generation of probabilistic relationships

We propose a method for the automatic discovery of probabilistic relationships in the environment of data integration systems. Dynamic data integration systems extend the architecture of current data integration systems by modeling uncertainty at their core. Our method is a probabilistic word sense disambiguation (PWSD), which allows to automatically lexically annotate (i.e. annotation w.r.t. a...

Probabilistic Contaminant Source Identification in Water Distribution Infrastructure Systems

Large water distribution systems can be highly vulnerable to penetration of contaminant factors caused by different means including deliberate contamination injections. As contaminants quickly spread into a water distribution network, rapid characterization of the pollution source has a high measure of importance for early warning assessment and disaster management. In this paper, a methodology...

Chapter 1 UNCERTAINTY IN DATA INTEGRATION

Data integration has been an important area of research for several years. In this chapter, we argue that supporting modern data integration applications requires systems to handle uncertainty at every step of integration. We provide a formal framework for data integration systems with uncertainty. We define probabilistic schema mappings and probabilistic mediated schemas, show how they can be ...

Chapter 7 UNCERTAINTY IN DATA INTEGRATION

Data integration has been an important area of research for several years. In this chapter, we argue that supporting modern data integration applications requires systems to handle uncertainty at every step of integration. We provide a formal framework for data integration systems with uncertainty. We define probabilistic schema mappings and probabilistic mediated schemas, show how they can be ...

Probabilistic quorum systems for dependable distributed data management

Among failure-prone and dynamic distributed systems there is a significant class of systems that strive for high availability and can function with inconsistent data. Examples include flight reservation systems which allow overbooking or emergency ambulance systems which return informative responses to time-critical queries. Data replication is a well-known technique for tolerating failures and...

Publication date: 2006